
Universal Approximation Theorem

What makes neural networks tick, mathematically.

This is going to be a really short article, but I decided it was important to talk about it. The reason NNs work at all comes down to a theorem called the Universal Approximation Theorem.

What is it?

What this means is that, given an x and a y, the NN can identify a mapping between them. "Approximately". This is needed when we have non-linearly separable data (i.e. you can't split the data into n parts just by, say, drawing a line between them). That covers complex structures like images or text, or anything else which cannot be modelled directly.

So we take a non-linear function, for example the sigmoid: $$\frac{1}{1 + e^{ - \left( w^{T}x + b \right)}}$$. Then we combine multiple such neurons in a way that lets us accurately model our problem. The end result is a complex function whose weights are distributed across many layers. Sound familiar? Welcome to Deep Learning (lol).
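To make that concrete, here is a minimal sketch in NumPy of a single hidden layer of sigmoid neurons being combined into one output. The weights are made-up values purely for illustration, not anything trained.

```python
import numpy as np

def sigmoid(z):
    # The non-linear activation: 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

# Toy single-hidden-layer network: y = V @ sigmoid(W @ x + b)
# All weights here are arbitrary illustrative numbers.
W = np.array([[ 2.0, -1.0],
              [-3.0,  0.5],
              [ 1.5,  2.0]])    # 3 hidden neurons, 2 inputs
b = np.array([0.1, -0.2, 0.3])  # hidden biases
V = np.array([1.0, -2.0, 0.5])  # output weights combining the neurons

def forward(x):
    hidden = sigmoid(W @ x + b)  # each neuron computes sigmoid(w^T x + b)
    return V @ hidden            # weighted sum of the neuron outputs

print(forward(np.array([0.5, -1.0])))
```

Stacking more of these layers on top of each other is all "deep" learning really means structurally.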

The Universal Approximation Theorem states that

a feed-forward network with a single hidden layer containing a finite number of neurons can approximate continuous functions on compact subsets of $$\mathbb{R}^{n}$$, under mild assumptions on the activation function.

Um. Can these guys speak normally? -.- Let's break it down a bit.
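Written out in symbols (this is the classic single-hidden-layer version, with $$\sigma$$ the activation and $$f$$ the continuous function we want to approximate on a compact set $$K \subset \mathbb{R}^{n}$$):

$$\forall \varepsilon > 0 \;\; \exists N,\ v_{i}, b_{i} \in \mathbb{R},\ w_{i} \in \mathbb{R}^{n} \;:\; \left| \sum_{i=1}^{N} v_{i}\, \sigma\!\left( w_{i}^{T}x + b_{i} \right) - f(x) \right| < \varepsilon \quad \text{for all } x \in K$$

In plain words: a big enough weighted sum of shifted and scaled activations can get within any error $$\varepsilon$$ of the target function.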

How does it work?

Well, here is an image. See if you can understand what's happening.

Makes sense, right? Every curve, if you zoom in far enough, is (approximately) just a collection of straight lines. Oh, and as a form of citation, here is where I got this image from. It's a great blog, you should really check it out.
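If you want to play with that idea directly, here is a small sketch of my own (not from the linked blog) that stitches steep sigmoids together into little steps that trace out a sine curve, assuming the target is sin(x) on [0, 2π]:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Target curve we want to approximate on [0, 2*pi]
f = np.sin
xs = np.linspace(0, 2 * np.pi, 500)

# Build the approximation as a sum of steep sigmoids, each acting like a
# small "step" placed at a grid point -- tiny pieces that together trace
# out the curve, just like the lines in the image.
n_steps = 50
grid = np.linspace(0, 2 * np.pi, n_steps)
steepness = 200.0  # a large w makes sigmoid(w * (x - c)) look like a step at c

approx = np.full_like(xs, f(grid[0]))
for c_prev, c in zip(grid[:-1], grid[1:]):
    jump = f(c) - f(c_prev)                    # how much the target rises or falls
    approx += jump * sigmoid(steepness * (xs - c))

print("max error:", np.max(np.abs(approx - f(xs))))  # shrinks as n_steps grows
```

Each term is exactly one "neuron" of the form $$v\,\sigma(wx + b)$$, so the whole sum is a single-hidden-layer network, which is the theorem in action.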